Perfect hashing using sparse matrix packing
Authors
Abstract
This article presents a simple algorithm for packing sparse 2-D arrays into minimal 1-D arrays in O(r^2) time. Retrieving an element from the packed 1-D array is O(1). This packing algorithm is then applied to create minimal perfect hashing functions for large word lists. Many existing perfect hashing algorithms process large word lists by segmenting them into several smaller lists. The perfect hashing function described in this article has been used to create minimal perfect hashing functions for unsegmented word sets of up to 5000 words. Compared with other current algorithms for perfect hashing, this algorithm is a significant improvement in terms of both time and space efficiency.
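The abstract does not spell out the packing steps, so the following is only a minimal sketch of one common approach to this kind of packing, first-fit row displacement, and should not be read as the article's exact algorithm. The names `pack_rows` and `lookup` are hypothetical, empty cells are assumed to be `None`, and each packed slot stores its owning row so that a lookup can reject cells belonging to another row. First-fit placement does not guarantee a minimal 1-D array in general; it happens to produce one for the small example shown.

```python
def pack_rows(matrix):
    """Pack a sparse 2-D list into a 1-D list using first-fit row displacement.

    Returns (offsets, packed): offsets[r] is the shift applied to row r, and
    packed[offsets[r] + c] holds (r, value) for each non-empty cell (r, c).
    """
    # Assumption: place the fullest rows first, a common heuristic.
    order = sorted(range(len(matrix)),
                   key=lambda r: -sum(v is not None for v in matrix[r]))
    offsets = [0] * len(matrix)
    packed = []
    for r in order:
        cols = [c for c, v in enumerate(matrix[r]) if v is not None]
        if not cols:
            continue  # an empty row needs no slots
        d = 0
        # Slide the row right until none of its cells lands on a used slot.
        while any(d + c < len(packed) and packed[d + c] is not None
                  for c in cols):
            d += 1
        offsets[r] = d
        # Grow the 1-D array if the shifted row extends past its end.
        packed.extend([None] * (d + cols[-1] + 1 - len(packed)))
        for c in cols:
            packed[d + c] = (r, matrix[r][c])
    return offsets, packed


def lookup(offsets, packed, r, c):
    """Constant-time retrieval: one addition, one index, one row check."""
    i = offsets[r] + c
    if i < len(packed) and packed[i] is not None and packed[i][0] == r:
        return packed[i][1]
    return None


matrix = [
    ["a", None, None, "b"],
    ["c", None, None, None],
    [None, "d", None, None],
]
offsets, packed = pack_rows(matrix)
# offsets is [0, 1, 1]; the four non-empty cells fill a 1-D array of length 4.
```

In a hashing setting, the row and column indices would be derived from each key (for example, from two of its letters), and the packed index `offsets[r] + c` would serve as the hash value; that mapping from keys to cells is an assumption here, not something the abstract specifies.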
Similar resources
Using Tries to Eliminate Pattern Collisions in Perfect Hashing
Many current perfect hashing algorithms suffer from the problem of pattern collisions. In this paper, a perfect hashing technique that uses array-based tries and a simple sparse matrix packing algorithm is introduced. This technique eliminates all pattern collisions, and because of this it can be used to form ordered minimal perfect hash functions on extremely large word lists. This algorithm i...
A Letter-oriented Perfect Hashing Scheme Based upon Sparse Table Compression
In this paper, a new letter-oriented perfect hashing scheme based on Ziegler’s row displacement method is presented. A unique n-tuple from a given set of static letter-oriented key words can be extracted by a heuristic algorithm. Then the extracted distinct n-tuples are associated with a 0/1 sparse matrix. Using a sparse matrix compression technique, a perfect hashing function on the key word...
Indexing Internal Memory with Minimal Perfect Hash Functions
A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are wide...
Sparse signal recovery using sparse random projections
Feature Hashing for Language and Dialect Identification
We evaluate feature hashing for language identification (LID), a method not previously used for this task. Using a standard dataset, we first show that while feature performance is high, LID data is highly dimensional and mostly sparse (>99.5%) as it includes large vocabularies for many languages; memory requirements grow as languages are added. Next we apply hashing using various hash sizes, d...
Journal title:
- Inf. Syst.
Volume 15, Issue
Pages -
Publication date 1990